Configuration Suggestions for Multiple Integrations/Services

mmartin · May 15, 2020, 2:14pm

Hello All,

I was wondering if someone would be able to suggest the best method for the following.

Basically, there are 3 Users who will receive notifications from Nagios. Our initial thought was to create either 3 Nagios Services or 3 Integrations. Then, in Nagios, we would have 3 unique PagerDuty contacts, one for each of us, and each of us would assign our contact to our own Hosts and/or Services so the right person is notified when a Problem notification is sent.

So, I’m wondering if the best way to do this is to create one Nagios “Service” with 3 “Integrations” within that service… Or should we create 3 Nagios Services with a single Integration within each service, one for each of us?

Basically, we’ll all be on-call 24x7 to receive notifications, as of now. But, I’m sure we’ll each want to tweak certain things like ack timeout, and things like that, per user… So maybe whichever method above gives us the most flexibility/scalability.

Any ideas on which method would be the best way to achieve this? Any thoughts or suggestions would be greatly appreciated.

Thanks in Advance,
Matt

jay · May 15, 2020, 2:14pm

Hey Matt! Thanks for waiting! My best advice would be to create 3 Nagios Services in your account with a single Integration within each service. If you had created a single service with 3 integrations tied to that service, you would not be able to select which user gets notified. Another option is to use a feature in our UI called “Response Plays”. A response play can be used directly by on-call engineers during triage when they determine, based on the Problem notification, that something they’ve been paged for is bigger than expected or belongs to another teammate. Here is a link (bottom of response) to the guide if you would like to take a look. Please let me know if you have any other questions or concerns and I will be more than happy to support you.

simonfiddaman · May 15, 2020, 2:14pm

Hey @mmartin,

You’ll definitely want at least 3 PagerDuty Services with one Nagios Integration each (and hey, why not add Extensions so your alerts can be visible in e.g. Slack!).

Services are linked to one Escalation Policy each, and that’s where you’ll determine who receives the page (based on which Schedules are part of the EPs). This is just an expansion of what @jay said:

“If you had created a single service with 3 integrations tied to that service, you would not be able to select which user gets notified.”
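To make that concrete, here is a toy model of the routing. It is not PagerDuty's API, and every key, Service name, and user in it is made up; it only assumes what is described above, namely that each Integration key belongs to one Service and each Service has exactly one Escalation Policy.

```python
# Toy model only: keys, Service names, and users are hypothetical.

def build_router(integration_to_service, service_to_policy):
    """Return a function that maps an Integration key to the users paged."""
    def route(integration_key):
        service = integration_to_service[integration_key]
        # A Service has a single Escalation Policy, so the Service alone
        # decides who is paged, whichever Integration the event came in on.
        return service_to_policy[service]
    return route

# Option A: one Service with three Integrations.
one_service = build_router(
    {"key-matt": "nagios", "key-user2": "nagios", "key-user3": "nagios"},
    {"nagios": ["matt", "user2", "user3"]},
)

# Option B: three Services with one Integration each.
three_services = build_router(
    {
        "key-matt": "nagios-matt",
        "key-user2": "nagios-user2",
        "key-user3": "nagios-user3",
    },
    {
        "nagios-matt": ["matt"],
        "nagios-user2": ["user2"],
        "nagios-user3": ["user3"],
    },
)

# Option A pages the same policy no matter which Nagios contact fired.
same_for_everyone = one_service("key-matt") == one_service("key-user2")

# Option B pages only the owner of the contact.
only_matt = three_services("key-matt")
```

In Option A the key tells you where the alert came from but cannot change who gets paged, which is why one Service per person is the safer starting point.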

You can extend this again and configure multiple PagerDuty Services (which re-use your existing Escalation Policies, Schedules, etc.) and then differentiate based on host/service alert importance. For example, you may have a PagerDuty Service for your DB service which is always High Urgency, and another for, say, your logging cluster, which you configure as Low Urgency: it’s important, yes, but it won’t take the customers out of service (plus it’s a cluster, so it can tolerate a bit of failure). Alternatively, use some Severity-based Event Rules to treat everything but CRITICAL as Low Urgency, raising the Urgency only when an alert reaches your critical threshold.
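The two approaches to urgency can be sketched as a small decision function. This is only an illustration of the logic, not Event Rules syntax; the function name, its parameters, and the "high"/"low" strings are assumptions made for the example.

```python
# Sketch of the urgency decision only; not PagerDuty's Event Rules syntax.
# urgency_for and its parameters are hypothetical names.

def urgency_for(service_default, nagios_state, severity_based=False):
    """Pick an urgency for an alert.

    service_default: "high" or "low", the fixed urgency set on the Service.
    nagios_state:    the Nagios state, e.g. "WARNING" or "CRITICAL".
    severity_based:  if True, ignore the fixed urgency and raise to "high"
                     only when the state is CRITICAL.
    """
    if severity_based:
        return "high" if nagios_state == "CRITICAL" else "low"
    return service_default

# DB Service: always High Urgency, whatever the state.
db_warning = urgency_for("high", "WARNING")

# Logging cluster: always Low Urgency.
logging_critical = urgency_for("low", "CRITICAL")

# Severity-based: Low until the critical threshold is reached.
rule_warning = urgency_for("low", "WARNING", severity_based=True)
rule_critical = urgency_for("low", "CRITICAL", severity_based=True)
```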

Configuring multiple PagerDuty Services gives you more flexibility - you can decide to handle on-call pages in a different way for a subset of your infrastructure / alerts, or temporarily hand some alerts over to someone else, set a maintenance window on just those services, etc. Because the Integration owns the Nagios contact Key, you can also move these about between Services, or have e.g. different Integrations for the same Service that are used by different monitoring locations, but they’ll show up as part of the same Service.
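As a rough sketch of that last point: since the key lives on the Integration, pointing an Integration at a different Service changes where alerts land without touching the key configured on the Nagios side. The keys and Service names below are hypothetical, and the dictionary is only a stand-in for the account configuration, not a real API.

```python
# Hypothetical keys and Service names; a sketch of the ownership idea only.

integrations = {
    "key-dc1": "db-service",  # Nagios in one monitoring location
    "key-dc2": "db-service",  # Nagios in another location, same Service
}

def move_integration(current, key, new_service):
    """Return a copy of the mapping with one Integration moved.

    The key itself is unchanged, so the Nagios contact that uses it
    needs no edits.
    """
    updated = dict(current)
    updated[key] = new_service
    return updated

# Temporarily hand the second location's alerts to another Service.
moved = move_integration(integrations, "key-dc2", "db-service-handover")
```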

Good luck,
@simonfiddaman